Memory management in MATLAB
Tyler Dare, 22 May 2015
Many high-channel-count measurements output files that are too large to load into memory at one time. Here are some strategies for loading, saving, and processing large files without crashing your system or generating OUT OF MEMORY errors.
Getting machine RAM programmatically
It can be helpful to get the RAM of your machine programmatically, especially if your code will be run on multiple systems.
[userMem, sysMem] = memory;                % Get memory statistics on your MATLAB session and the current system
                                           % (avoid naming the output "system", which shadows the built-in function)
bytes = sysMem.PhysicalMemory.Available;   % Get the RAM (in bytes) that isn't being used by other programs
available_RAM_GB = bytes/1024^3
available_RAM_GB =

    9.4970
I have had good luck limiting variables to about 25% of available RAM.
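As a rough sketch of that rule of thumb (the guard and the example array size are illustrative, not from the original note), you can check a planned allocation against the budget before creating the array:

[~, sysMem] = memory;
maxBytes = 0.25 * sysMem.PhysicalMemory.Available;   % 25% of currently available RAM
plannedBytes = 8 * 5000 * 5000;                      % e.g., a 5000x5000 array of doubles (8 bytes each)
if plannedBytes > maxBytes
    warning('Planned array would exceed 25%% of available RAM.')
end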
Subfunctions can temporarily create duplicate copies of variables and unexpectedly double RAM usage. For more information, see
http://www.mathworks.com/help/matlab/matlab_prog/strategies-for-efficient-use-of-memory.html
Saving data as singles instead of doubles
Most data acquisition systems (including National Instruments hardware) use 24-bit analog-to-digital converters. Single-precision floating-point numbers have a 24-bit significand, so no information is lost if 24-bit data is saved as singles instead of doubles, thereby cutting disk space usage in half.
You can initialize an array of singles using
S = zeros(1000,1000,'single');
Or if you want an array of complex singles (for example, from a CPSD matrix) you can use
S = zeros(1000,1000,'like',single(1i));
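If the data was acquired as doubles, it can be converted before saving. A minimal sketch, with hypothetical variable and file names:

x = randn(1,1e6);              % stand-in for acquired data (hypothetical)
xSingle = single(x);           % convert to single precision
save('myDataFile','xSingle')   % the saved variable takes half the disk space of the double version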
Load only parts of *.mat files
If the *.mat file you want to read has multiple large variables in it, you can read only some of them. First, find out what variables are in the file:
whos -file myDataFile
  Name           Size            Bytes  Class     Attributes

  dataSet1    1000x1000        4000000  single
  dataSet2    1000x1000        4000000  single
Then for each large variable you can load it, do some processing, and clear it before moving on to the next:
load('myDataFile','dataSet1')
m(1) = norm(dataSet1);
clear dataSet1
load('myDataFile','dataSet2')
m(2) = norm(dataSet2);
clear dataSet2
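When a file holds many such variables, the same pattern can be looped. A minimal sketch, assuming each variable should get the same processing:

info = whos('-file','myDataFile');           % list the variables without loading them
for ii = 1:numel(info)
    S = load('myDataFile', info(ii).name);   % load one variable into a scalar struct
    m(ii) = norm(S.(info(ii).name));         % process it via a dynamic field name
    clear S                                  % free the memory before the next variable
end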
Random access to *.mat files
Recent versions of MATLAB support random access to *.mat files, which lets you load only parts of variables. The *.mat file must be saved in the correct format.
Use the option 'v7.3' to save as the version that allows random access:
fs = 44100;
t = 0:1/fs:100;
x = sin(2*pi*100*t);                            % Generate a 100 s long time history
save('longTimeHistory','fs','t','x','-v7.3')    % Save the large data file using the v7.3 option
clear all
To load only part of the time history, first create a mat object
matObj = matfile('longTimeHistory')
matObj =

  matlab.io.MatFile

  Properties:
      Properties.Source: 'Z:\Notes\longTimeHistory.mat'
    Properties.Writable: false
                     fs: [1x1 double]
                      t: [1x4410001 double]
                      x: [1x4410001 double]
Use the mat object like a structure to access the variables
fs = matObj.fs;                  % Read the sample rate
dataChunk = matObj.x(1,1:fs);    % Read the first second of the data into memory
timeChunk = matObj.t(1,1:fs);    % Read the first second of the time vector into memory
                                 % (even though t is a vector, you must specify both dimensions)
Note that mat objects can be read only in contiguous chunks of data, and that using the 'end' keyword causes MATLAB to load the entire variable. For more information, see
http://www.mathworks.com/help/matlab/ref/matfile.html
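Putting this together, here is a minimal sketch of chunked processing (the RMS computation and chunking scheme are illustrative, not from the original note) that works through the long time history one contiguous second at a time:

matObj = matfile('longTimeHistory');
fs = matObj.fs;
[~, n] = size(matObj, 'x');          % get the length of x without loading it
nChunks = floor(n/fs);
rmsPerSecond = zeros(1, nChunks);
for ii = 1:nChunks
    idx = (ii-1)*fs + (1:fs);        % contiguous one-second chunk of samples
    chunk = matObj.x(1, idx);        % only this chunk is read from disk
    rmsPerSecond(ii) = sqrt(mean(chunk.^2));
end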
One important note is that v7.3 files are row-major, instead of column-major like most MATLAB storage. This can make a big difference in the speed of loading variables.
A = rand(10000,10000);
save('largeDataSet','A','-v7.3');      % Create a large variable and save it
clear A
matObj = matfile('largeDataSet');
for ii = 1:10
    tic
    dataColumn = matObj.A(:,ii);       % Load one column of A (columns 1-10 over the loop)
    tLoadColumn(ii) = toc;             % Measure how long column reading takes
    tic
    dataRow = matObj.A(ii,:);          % Load one row of A (rows 1-10 over the loop)
    tLoadRow(ii) = toc;                % Measure how long row reading takes
end
averageColumnReadTime = mean(tLoadColumn)
averageRowReadTime = mean(tLoadRow)
averageColumnReadTime =

    2.6365

averageRowReadTime =

    0.6252
For this case, row reading was significantly faster than column reading on average. If reading large data sets is much slower than you would like, it may be worth transposing some of your large matrices so that you can read them efficiently.
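A minimal sketch of that workaround (file and variable names are hypothetical):

A = rand(10000,10000);
At = A.';                            % transpose once, so former columns become rows
save('largeDataSetT','At','-v7.3');
clear A At
matObjT = matfile('largeDataSetT');
formerColumn = matObjT.At(1,:);      % this fast row read returns what was column 1 of A

The transpose costs one full pass through the data up front, so it only pays off if you will read along the slow dimension many times.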