This is a set of MatLab functions to help oceanographers read/download/write/merge NetCDF files easily and flexibly.
HappySpring/Easy_NetCDF
This is a set of MATLAB functions that make it easier for oceanographers to handle large sets of NetCDF files. The functions are built on the low-level MATLAB NetCDF library package.
- Load variables in a custom region across multiple files quickly.
- Enhanced features for downloading NetCDF files via OpenDAP.
Load variables
- Correct scales, offsets, missing values automatically
- Load time in "datenum" format (days since 0000-01-00 00:00:00)
- Load a subset of a variable by specifying the longitude, latitude, time, etc., directly.
- Load variable from multiple files automatically
- Save/load dimensional information to/from a cache file, which can improve the performance significantly when it reads from a large number of files.
File operations
- Extract a subset of a NetCDF file to a new one.
- Download NetCDF files via OpenDAP easily and reliably.
- Support large datasets: variables can be downloaded block by block.
- Retry automatically after interruptions.
- Merge files by time
- Merge files by time (save mean values)
Write NetCDF files
- Write simple NetCDF files quickly. (Support for writing NetCDF files is not the core purpose of this toolbox. Please use the MATLAB NetCDF library package for writing complex files.)
- This toolbox does not support groups, and there is no plan to add this feature in the near future.
| Path | Notes |
|---|---|
| ./ | Functions for current version |
| ./private | Other functions called by this toolbox |
| ./Documents_and_demo | Documents and the NetCDF files used in the demos |
| ./Archive | Old code. It should not be used. |
- `FUN_nc_varget_enhanced_region_2_multifile`: Read a subset of a variable from multiple files
- `FUN_nc_varget_enhanced`: Read a variable from one file
- `FUN_nc_varget_enhanced_region_2`: Read a subset of a variable from one file
- `FUN_nc_get_time_in_matlab_format`: Read the time variable into "matlab units" (days since 0000-01-00 00:00)
- `FUN_nc_OpenDAP_with_limit`: Download via OpenDAP. It supports block-by-block download and automatic retry after network errors.
Add the root folder of this package to the MATLAB search path. The subfolders (`private`, `Documents_and_demo`, `Archive`) should not be added to the search path. This can be done in two ways:

With the GUI: click "Home tab > Set Path" to open the dialog for setting the path. Then click "Add Folder...", add the root path of this package (the folder that contains many functions, including `FUN_nc_varget.m`), and click "Save" near the bottom of the dialog.

With commands:

Method 1 (recommended):

```matlab
addpath('/path/to/Easy_netcdf/');
savepath
```

Method 2: MATLAB runs `startup.m` automatically at startup if the file exists. You can therefore add `addpath('/path/to/Easy_netcdf/');` to `startup.m` and make sure that `startup.m` is located in an existing search path. This provides more flexibility.
Several functions in this package were written for this purpose. All of them, except `FUN_nc_varget`, can be replaced by `FUN_nc_varget_enhanced_region_2_multifile`.

[New] You can load all variables from a single NetCDF file with

```matlab
data = FUN_nc_load_all_variables( fn );
```

where `fn` is the name of the NetCDF file, or with

```matlab
data = FUN_nc_load_all_variables( fn, 'time_var_name', var_time );
```

which converts the time variable (`var_time`) to MATLAB units (days since 0000-01-00 00:00) according to its `units` property.
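The conversion from a `units` attribute to MATLAB time can be illustrated with built-in functions only. This is a minimal sketch, not the toolbox's implementation: the units string and the raw time values are made-up examples, and it assumes the reference date is written as `yyyy-mm-dd HH:MM:SS`.

```matlab
% Hypothetical example values; not read from any real file.
units_str = 'hours since 1950-01-01 00:00:00';
t_raw     = [0 24 36];            % time values as stored in the file

% Split "<unit> since <reference date>" into its two parts.
tok       = regexp(units_str, '^(\w+)\s+since\s+(.+)$', 'tokens', 'once');
unit_name = lower(tok{1});
t0        = datenum(tok{2}, 'yyyy-mm-dd HH:MM:SS');   % reference date as datenum

% Length of one unit, expressed in days.
switch unit_name
    case 'days',    unit_to_day = 1;
    case 'hours',   unit_to_day = 1/24;
    case 'minutes', unit_to_day = 1/1440;
    case 'seconds', unit_to_day = 1/86400;
    otherwise,      error('Unsupported time unit: %s', unit_name);
end

t_matlab = t0 + t_raw * unit_to_day;   % days since 0000-01-00 00:00
disp(datestr(t_matlab));
```

Files that use other date layouts or calendar types would need additional handling that this sketch does not cover.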
data = FUN_nc_varget( filename, varname );
- `off_set` will not be applied.
- `scale` will not be applied.
- `missing values` will not be replaced by NaN.
- Loaded data keeps its original type as in the NetCDF file.
- filename: path to a specific netcdf file
- varname : name of the variable to be read
- data: values read from the netcdf file.
data= FUN_nc_varget('Demo_SST_2001.nc','sst');
data = FUN_nc_varget_enhanced( filename, varname );
This is the recommended command for loading one variable from one file.
- `off_set` will be applied.
- `scale` will be applied.
- `missing values` will be replaced by NaN.
- Data will be converted to `double`.
- filename: path to a specific netcdf file
- varname : name of the variable to be read
- data: values read from the netcdf file.
data= FUN_nc_varget_enhanced('Demo_SST_2001.nc','sst');
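The difference between the raw and the "enhanced" reading can be illustrated by unpacking a few values by hand. The sketch below follows the common NetCDF packing convention (`value = raw * scale_factor + add_offset`); the numbers and attribute values are hypothetical, and the code is not taken from the toolbox.

```matlab
% Hypothetical packed values and attributes (not taken from the demo file).
raw          = int16([100 250 -32768 400]);  % values as stored on disk
scale_factor = 0.01;
add_offset   = 20;
fill_value   = int16(-32768);                % value marking missing data

data = double(raw);                          % convert to double first
data(raw == fill_value) = NaN;               % missing values become NaN
data = data * scale_factor + add_offset;     % unpack: raw * scale + offset

disp(data);
```

Converting to `double` before scaling avoids integer rounding and makes it possible to store NaN.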
data = FUN_nc_varget_enhanced_region( filename, varname, start, count, stride);
- Read a part of the domain.
- `off_set` will be applied.
- `scale` will be applied.
- `missing values` will be replaced by NaN.
- Loaded data will be converted to `double`.

- filename: path to a specific netcdf file
- varname: name of the variable to be read
- start, count, stride: same as in the documentation for `netcdf.getVar`
- data: values read from the netcdf file.
```matlab
% parameters
fn = 'Demo_SST_2001.nc';
lonlimit = [-110 -20];
latlimit = [15 70];

% read lon/lat
lon = FUN_nc_varget_enhanced( fn, 'lon' );
lat = FUN_nc_varget_enhanced( fn, 'lat' );

% calculate [start, count] from range for lon/lat.
[x_start, x_count, xloc] = FUN_nc_varget_sub_genStartCount( lon, lonlimit );
[y_start, y_count, yloc] = FUN_nc_varget_sub_genStartCount( lat, latlimit );

nc_start  = [x_start, y_start, 0 ]  % the third value is for time.
nc_count  = [x_count, y_count, 1 ]
nc_stride = [1, 1, 1];

% load data
data = FUN_nc_varget_enhanced_region( fn, 'sst', nc_start, nc_count, nc_stride );
```
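How a coordinate range maps to `start` and `count` can be sketched with plain indexing. This is an illustration under stated assumptions, not the code of `FUN_nc_varget_sub_genStartCount`: the axis below is made up, it is assumed to be monotonic so that the selected points are contiguous, and `start` is zero-based as expected by `netcdf.getVar`.

```matlab
lon      = -180:2:178;        % hypothetical coordinate axis
lonlimit = [-110 -20];

% 1-based MATLAB indices of all points inside the limits
loc   = find(lon >= lonlimit(1) & lon <= lonlimit(2));

start = loc(1) - 1;           % the low-level netcdf library counts from 0
count = numel(loc);           % number of points to read

lon_sub = lon(loc);           % coordinates of the selected points
fprintf('start = %d, count = %d\n', start, count);
```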
[ out_dim, data ] = FUN_nc_varget_enhanced_region_2( filename, varname, dim_name, dim_limit, [time_var_name], [dim_varname] );
This can also be done by "FUN_nc_varget_enhanced_region_2_multifile.m".
- Load a part of the domain.
- `off_set` will be applied.
- `scale` will be applied.
- `missing values` will be replaced by NaN.
- Loaded data will be converted to `double`.
- filename [char]: name of the NetCDF file (e.g., 'temp.nc')
- varname [char]: name of the variable (e.g., 'sst' or 'ssh')
- dim_name [cell]: names of the dimensions related to the variable specified above, like {'lon'}, {'lon','lat'}, {'lon', 'lat', 'depth', 'time'}. Dimensions with custom limits must be listed here. Other dimensions are optional.
- dim_limit [cell]: limits of dimensions in a cell (e.g., {[-85 -55], [-inf inf]}). Please provide the limits in the same order as the dimensions are listed in `dim_name`.
- time_var_name [char, optional]: name of the variable for time.
  - If this is not empty, the limit for time in `dim_limit` can be given in "matlab units" (days since 0000-01-00 00:00) or created by `datenum`.
  - If this is not empty, the time in `out_dim` will be given in "matlab units" (days since 0000-01-00 00:00).
- dim_varname [cell, optional]: name of the variable defining the axis at each dimension.
  - By default, each axis is defined by a variable sharing the same name. For example, the axis `lon` should be accompanied by a variable named `lon`. In such a case, `dim_varname` should be left empty.
  - If the axis is defined by a variable with a different name, the name of that variable should be specified manually here. For example, the meridional dimension in 'Demo_SST_2001.nc' is named "y". However, the latitude is defined by a variable `lat`. In such a situation:

    ```matlab
    fn = 'Demo_SST_2001.nc'
    dim_name = {'lon', 'y'};
    dim_limit = { [-110 -20], [15 70] };
    time_var_name = 'time';
    dim_varname = {'lon', 'lat'}
    varname = 'sst'
    [out_dim, data] = FUN_nc_varget_enhanced_region_2( fn, 'sst', dim_name, dim_limit, time_var_name, dim_varname );
    ```

  - "dim_varname{1} = nan" indicates that the axis is not defined by any variable in the file. It will then be defined as 1, 2, 3, ... Nx, where Nx is the length of the dimension.
- out_dim : dimension info (e.g., longitude, latitude, if applicable)
- data : data extracted from the given netcdf file.
- When `time_var_name` is not empty, the corresponding variable in `out_dim` is converted to the same format as `datenum`. However, this unit conversion is never applied to the output variable `data`. If you want to read the time variable itself, please use `FUN_nc_get_time_in_matlab_format`.
```matlab
fn = 'Demo_SST_2001.nc';
dim_name = {'lon', 'y', 'time'};
dim_limit = { [-110 -20], [15 70], [datenum(2001,5,1) datenum(2001,5,31)] };
time_var_name = 'time';
dim_varname = {'lon', 'lat', 'time'};
varname = 'sst';

[out_dim, data] = FUN_nc_varget_enhanced_region_2( fn, 'sst', dim_name, dim_limit, time_var_name, dim_varname );

% Plot
% FUN_MAP_pcolor_lonlat_quick( lon, lat, data(:,:,1)');
pcolor( out_dim.lon, out_dim.lat, data' );
cbar = colorbar;
shading interp
title( datestr( out_dim.time ) )
```
dim_varname{2} is set to nan to read the second record in time.
```matlab
fn = 'Demo_SST_2001.nc';
dim_name = {'y', 'time'};
dim_limit = { [0 50], [2, 2] };
time_var_name = [];
dim_varname = {'lat', nan};
varname = 'sst';

[out_dim, data] = FUN_nc_varget_enhanced_region_2( fn, 'sst', dim_name, dim_limit, time_var_name, dim_varname );

% Plot
pcolor( out_dim.lon, out_dim.lat, data' );
cbar = colorbar;
shading interp
axis equal
% title(datestr(out_dim.time))
```
[ out_dim, data_out ] = FUN_nc_varget_enhanced_region_2_multifile( filelist, varname, dim_name, dim_limit, merge_dim_name, time_var_name, dim_varname );
- Load a variable across several files.
- Load a part of the domain.
- `off_set` will be applied.
- `scale` will be applied.
- `missing values` will be replaced by NaN.
- Loaded data will be converted to `double`.
- filelist [struct array]: names and folders of the NetCDF files. `filelist` must include 2 fields, `name` and `folder`. For each element of `filelist` (e.g., the ith one), the full path is generated by `fullfile( filelist(ith).folder, filelist(ith).name )`. It can also be a cell array containing file paths, or a char matrix, each row of which contains one path.
- varname [char]: name of the variable
- dim_name [cell]: names of dimensions, like {'lon','lat'}. Dimensions with custom limits must be listed here. Other dimensions are optional.
- dim_limit [cell]: limits of dimensions, like {[-85 -55], [30 45]}.
- merge_dim_name [string]: name of the dimension along which the variables from different files will be concatenated. If `merge_dim_name` is empty, the variable will be concatenated after its last dimension.
  - Example 1: if you want to read gridded daily temperature given in [lon, lat, depth, time] from a set of files, and each file contains the temperature of one day, `merge_dim_name` should be 'time'.
  - Example 2: if you want to read gridded daily temperature given in [lon, lat, depth], in which time is not given explicitly in each file, you can leave `merge_dim_name` empty.
- time_var_name [char, optional]: name of the time axis
  - The variable defined by this will be loaded as time in "matlab units" (days since 0000-01-00).
  - This is helpful for setting timelimit in an easy way, avoiding the need to calculate the timelimit from the units in the netcdf files. For example, to read data between 02/15/2000 00:00 and 02/16/2000 00:00 from a netcdf file that includes a time variable "ob_time" in units of "days since 2000-00-00 00:00", you need to set timelimit to [46 47] when `time_var_name` is empty. However, you should set timelimit to [datenum(2000,2,15), datenum(2000,2,16)] if `time_var_name` is set to "ob_time".
- dim_varname [cell, optional]: name of the variable defining the axis at each dimension.
  - By default, each axis is defined by a variable sharing the same name as the dimension.
  - "dim_varname{1} = nan" indicates that the axis is not defined by any variable in the file. It will be defined as 1, 2, 3, ... Nx, where Nx is the length of the dimension.
- out_dim: dimension info (e.g., longitude, latitude, if applicable)
- data: data extracted from the given netcdf files.

```matlab
filelist = dir('Demo_*.nc');
varname = 'sst';
dim_name = { 'time' };
dim_limit = { [datenum(2001,12,1) datenum(2003,5,31)] };
merge_dim_name = 'time';
time_var_name = 'time';
dim_varname = [];

[out_dim, data] = FUN_nc_varget_enhanced_region_2_multifile( filelist, varname, dim_name, dim_limit, merge_dim_name, time_var_name, dim_varname );
```
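The [46 47] time limit quoted in the `time_var_name` description can be reproduced by hand. This is a small sketch that assumes the file's reference date corresponds to `datenum(2000,1,0)` and that the unit is days; both are assumptions chosen to match the numbers in the text.

```matlab
t_ref = datenum(2000, 1, 0);     % assumed reference date of the file's time axis

% Limits expressed in MATLAB units, as used when time_var_name is set
timelimit_matlab = [datenum(2000,2,15), datenum(2000,2,16)];

% The same limits expressed in the file's own units (days since t_ref),
% as needed when time_var_name is left empty
timelimit_file = timelimit_matlab - t_ref;

disp(timelimit_file);
```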
Note: `filelist` can also be a cell array like this

```matlab
filelist = {'Demo_SST_2001.nc'
            'Demo_SST_2002.nc'
            'Demo_SST_2003.nc'
            'Demo_SST_2004.nc'
            'Demo_SST_2005.nc'
            'Demo_SST_2006.nc'
            'Demo_SST_2007.nc'
            'Demo_SST_2008.nc'
            'Demo_SST_2009.nc'
            'Demo_SST_2010.nc'};
```

or a char array like this

```matlab
filelist = ['Demo_SST_2001.nc'
            'Demo_SST_2002.nc'
            'Demo_SST_2003.nc'
            'Demo_SST_2004.nc'
            'Demo_SST_2005.nc'
            'Demo_SST_2006.nc'
            'Demo_SST_2007.nc'
            'Demo_SST_2008.nc'
            'Demo_SST_2009.nc'
            'Demo_SST_2010.nc'];
```

```matlab
filelist = dir('Demo_*.nc');
varname = 'sst';
dim_name = { 'lon', 'y', 'time' };   % In the demo files, the meridional dimension is named "y".
dim_limit = { [-110 -20], [15 70], [datenum(2001,12,1) datenum(2003,11,30)] };
merge_dim_name = 'time';             % merge data in the "time" dimension.
time_var_name = 'time';              % convert values in "time" to matlab units (days since 0000-01-00 00:00).
dim_varname = {'lon', 'lat', 'time'}; % This forces the function to read values for the meridional dimension from the variable "lat".

[out_dim, data] = FUN_nc_varget_enhanced_region_2_multifile( filelist, varname, dim_name, dim_limit, merge_dim_name, time_var_name, dim_varname );
```
**Notes:** `FUN_nc_gen_presaved_netcdf_info` is replaced by `FUN_nc_gen_presaved_netcdf_info_v2`, introduced in v1.50-beta. The new version saves the dimensional information in a new format ("v2"), dropping a lot of unnecessary information. The output mat file is about 10 times smaller than the previous one and performs better. Related functions have been updated to support the new format.
Reading a subset of data from hundreds of files can be slow when a list of all files is provided to `FUN_nc_varget_enhanced_region_2_multifile`, because the function needs to open every single file to collect dimensional information. To speed this up, read and save the dimensional information once at the very beginning, then provide the pre-saved information to `FUN_nc_varget_enhanced_region_2_multifile`.
The dimensional information can be generated by `FUN_nc_gen_presaved_netcdf_info_v2`. An example is shown below:
```matlab
%% generate info -----------------------------------------------------------
filelist = dir('Demo_*.nc');
merge_dim_name = 'time';               % merge data in the "time" dimension.
dim_name = { 'lon', 'y', 'time' };     % In the demo files, the meridional dimension is named "y".
dim_varname = {'lon', 'lat', 'time'};  % This forces the function to read values for the meridional dimension from the variable "lat".
time_var_name = 'time';                % convert values in "time" to matlab units (days since 0000-01-00 00:00). This is optional.
output_file_path = 'Presaved_info_demo.mat';

% Please note that **absolute** file paths are saved in the generated file.
% If you move the data, you need to run this again.
pregen_info = FUN_nc_gen_presaved_netcdf_info_v2( filelist, merge_dim_name, dim_name, dim_varname, time_var_name, output_file_path );

%% read data --------------------------------------------------------------
varname = 'sst';
dim_name = { 'time' };
dim_limit = { [datenum(2001,12,1) datenum(2003,5,31)] };
merge_dim_name = 'time';

presaved_info = load( output_file_path );
presaved_info = presaved_info.pregen_info;

[out_dim, data_out] = FUN_nc_varget_enhanced_region_2_multifile( presaved_info, varname, dim_name, dim_limit );
```
FUN_nc_OpenDAP_with_limit( filename0, filename1, dim_limit_var, dim_limit_val, var_download, var_divided, divided_dim_str, Max_Count_per_group, ... )
- Support large datasets.
- Retry automatically after interruptions.
- Download data piece by piece.
- Download a subset of the original file.
- filename0: source of the netcdf file (an OpenDAP URL here)
- filename1: name of the output netcdf file
- dim_limit_var: the axes for which you want to set limits
- dim_limit_val: the limits of each axis
- var_download: the variables you'd like to download. [`var_download = []` will download all variables.]
- var_divided: the variables that need to be downloaded piece by piece along a specific dimension. In many cases, OpenDAP ends up with no response if you try to download too much data at once. A solution is to download the data piece by piece.
- divided_dim_str: the dimension along which you'd like to download piece by piece (e.g., 'time' or 'depth'). `divided_dim_str = []` means all variables will be downloaded completely at once.
- Max_Count_per_group: maximum number of points in the divided dimension per piece.

| Parameter | Default value | Note |
|---|---|---|
| dim_varname | dim_limit_name | Names of variables defining dimensions given in dim_limit_name |
| time_var_name | [] | Name of the variable describing time |
| is_auto_chunksize | false | Calculate chunk size by a function in this package (beta) |
| compression_level | 1 | |
| is_skip_blocks_with_errors | false | |
| N_max_retry | 10 | |
| var_exclude | [] | |

Output: N/A

Notice: To recognize an axis correctly, there must be a variable with the same name as the axis. Assigning a variable to a specific axis is not supported yet.
```matlab
% HYCOM dataset at an OpenDAP server
filename0 = 'http://tds.hycom.org/thredds/dodsC/GLBu0.08/expt_19.1/2012';

% output filename
filename1 = 'HYCOM_test2.nc';

% calculate time limits
timelimit = [datenum(2012,1,1) datenum(2012,1,3)];
time = FUN_nc_varget( filename0, 'time' );
time_unit = FUN_nc_attget( filename0, 'time', 'units' );
[time0, unit_str, unit_to_day] = FUN_nc_get_time0_from_str( time_unit );
timelimit = (timelimit - time0) / unit_to_day;

% set limits
lonlimit = [-76 -70];
latlimit = [32 39];
depthlimit = [0 100];

dim_limit_var = {'lon', 'lat', 'depth', 'time'};
dim_limit_val = {lonlimit, latlimit, depthlimit, timelimit};

% variables to be downloaded
var_download = {'water_temp', 'lon', 'lat', 'depth', 'time'};  % empty indicates downloading all variables

% variables that should be downloaded block by block
var_divided = {'water_temp'};

% which dim you'd like to download block by block (e.g., 'time', or 'depth')
divided_dim_str = 'depth'

% max size of each "piece"
Max_Count_per_group = 5;

FUN_nc_OpenDAP_with_limit( filename0, filename1, dim_limit_var, dim_limit_val, var_download, var_divided, divided_dim_str, Max_Count_per_group )
```
```matlab
% HYCOM dataset at an OpenDAP server
opendapURL = 'http://tds.hycom.org/thredds/dodsC/GLBu0.08/expt_19.1/2012';

% output filename
filename_out = 'HYCOM_test1.nc';

% calculate time limits
timelimit = [datenum(2012,1,1) datenum(2012,1,3)];
time_varname = 'time';  % tell the code which variable contains time

% set limits
lonlimit = [-76 -70];
latlimit = [32 39];
depthlimit = [0 100];

dim_limit_var = {'lon', 'lat', 'depth', 'time'};
dim_limit_val = {lonlimit, latlimit, depthlimit, timelimit};

% variables to be downloaded
var_download = {'water_temp', 'lon', 'lat', 'depth', 'time'};  % empty indicates downloading all variables

% variables that should be downloaded block by block
var_divided = var_download;

% which dim you'd like to download block by block (e.g., 'time', or 'depth')
divided_dim_str = 'depth';

% max size of each "piece"
N_divided_rec_per_group = 2;

FUN_nc_OpenDAP_with_limit( opendapURL, filename_out, dim_limit_var, dim_limit_val, var_download, var_divided, divided_dim_str, N_divided_rec_per_group, 'time_var_name', time_varname );
```
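The idea behind block-by-block download with automatic retry can be sketched in a few lines. This is only an illustration of the pattern, not the implementation of `FUN_nc_OpenDAP_with_limit`: the variable names `n_total`, `max_count` and `n_retry` are hypothetical stand-ins, and the actual network request is replaced by a placeholder `fprintf`.

```matlab
n_total   = 11;   % hypothetical length of the divided dimension
max_count = 5;    % plays the role of Max_Count_per_group
n_retry   = 10;   % plays the role of N_max_retry

block_start = 0:max_count:(n_total-1);                 % zero-based start of each block
block_count = min(max_count, n_total - block_start);   % the last block may be shorter

for ib = 1:numel(block_start)
    for itry = 1:n_retry
        try
            % Placeholder for the actual request of one block, i.e. reading
            % block_count(ib) records starting at block_start(ib).
            fprintf('block %d: start=%d, count=%d\n', ib, block_start(ib), block_count(ib));
            break   % success: leave the retry loop
        catch err
            warning('Attempt %d failed: %s', itry, err.message);
            pause(1);   % wait briefly before retrying
        end
    end
end
```

Smaller blocks mean more requests but less data lost when a single request fails.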
FUN_nc_merge( input_dir, filelist, output_fn, merge_dim_name, compatibility_mode )
- input_dir: the folder in which all input netcdf files given by `filelist` are located
- filelist: the list of files to be merged. This should be generated by the MATLAB built-in command `dir`. This function merges the netcdf files following the order given in this variable. Please make sure this variable has been sorted properly.
- output_fn: name of the output netcdf file
- merge_dim_name: name of the dimension along which all variables will be merged.
- compatibility_mode:
  - compatibility_mode = 1: write netCDF in 'CLOBBER'; compression is disabled.
  - compatibility_mode = 0: write netCDF in 'NETCDF4'.

Output: N/A
Note
- To recognize an axis correctly, there must be a variable with the same name as the axis.
- Variables without the dimension `merge_dim_name` will be copied from the first file given in the variable `filelist`.
- The time in the merged file may not be correct if the time units vary between files.
```matlab
% input_dir: path for the folder containing the files
input_dir = '.';

% filelist
filelist = dir( fullfile( input_dir, 'Merge_Demo*.nc' ) );

% output filename
output_fn = 'Merged_Output.nc';

% name of the dimension to be merged.
merge_dim_name = 'time';

% compatibility_mode:
%   compatibility_mode = 1: write netCDF in 'CLOBBER'; Compression would be disabled.
%   compatibility_mode = 0: write netCDF in 'NETCDF4'.
compatibility_mode = 0;

strvcat( filelist(:).name )
FUN_nc_merge( input_dir, filelist, output_fn, merge_dim_name, compatibility_mode )
```
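Because the files are merged in the order given by `filelist`, it can be worth making that order explicit before merging. The sketch below sorts a struct array by file name using built-in functions; the file names are hypothetical, and alphabetical order only matches time order when the names contain zero-padded dates or counters.

```matlab
% Hypothetical file list with names out of order, built by hand here
% instead of calling dir so that the example is self-contained.
filelist = struct('name',   {'Merge_Demo_03.nc', 'Merge_Demo_01.nc', 'Merge_Demo_02.nc'}, ...
                  'folder', {'.', '.', '.'});

[~, idx] = sort({filelist.name});   % alphabetical order of the file names
filelist = filelist(idx);           % reorder the struct array accordingly

disp(char(filelist.name));          % show the order that would be merged
```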
FUN_nc_merge_save_mean( input_dir, filelist, output_fn, merge_dim_name, compatibility_mode, list_var_excluded )
```matlab
% input_dir: path for the folder containing the files
input_dir = '.';

% filelist
filelist = dir( fullfile( input_dir, 'Merge_Demo*.nc' ) );

% Output filename
output_fn = 'Merged_Output_mean.nc';

% Name of the dimension to be merged.
merge_dim_name = 'time';

% compatibility_mode:
%   compatibility_mode = 1: write netCDF in 'CLOBBER';
%   compatibility_mode = 0: write netCDF in 'NETCDF4';
compatibility_mode = 0;

% Variables that should not be included in the output file.
list_var_excluded = [];

% Execute
FUN_nc_merge_save_mean( input_dir, filelist, output_fn, merge_dim_name, compatibility_mode, list_var_excluded );
```
FUN_nc_easywrite_enhanced( filename, dim_name, dim_length, varname, dimNum_of_var, data, global_str_att )
- filename [char]: name of the output netcdf file (e.g., 'test.nc')
- dim_name [cell]: names of dimensions (e.g., {'lon','lat'})
- dim_length [array]: length of each dimension (e.g., [ 360, 180 ])
- varname [cell]: names of variables (e.g., {'lon','lat','sst','ssh'})
- dimNum_of_var [cell]: dimension IDs for each variable (e.g., { 1, 2, [1,2], [1,2] }). The value 1 indicates the first dimension in `dim_name`; the value 2 indicates the second dimension in `dim_name`.
- data [cell]: values for each variable (e.g., {lon,lat,sst,ssh})
- global_str_att: global attribute

| Parameter | Default value | Note |
|---|---|---|
| dim_varname | dim_limit_name | Names of variables defining dimensions given in dim_limit_name |
| time_var_name | [] | Name of the variable describing time |
| is_auto_chunksize | false | |
| compression_level | 1 | |
| is_skip_blocks_with_errors | false | |
| N_max_retry | 10 | |
| var_exclude | [] | |
Output: N/A

```matlab
% ---- generate random data ----
lon = [-75:-55];
lat = [26:55];
depth = [0:5:100];
temp = rand( length(lon), length(lat), length(depth) );

%% ---- write netCDF ----
filename = 'Test_random_values.nc';
dim_name = {'lon', 'lat', 'depth'};
dim_length = [length(lon), length(lat), length(depth)];
varname = {'temp', 'lon', 'lat', 'depth'};
dimNum_of_var = { [1,2,3], 1, 2, 3 };
data = { temp, lon, lat, depth };
global_att = 'This is a test.';

FUN_nc_easywrite_enhanced( filename, dim_name, dim_length, varname, dimNum_of_var, data, global_att )

% FUN_nc_easywrite_enhanced('temp.nc',...
%     {'Node','Cell','time'},[1000 2000 500],...
%     {'node_lon','node_lat','lon_cell','lat_cell','sst'},{1,1,2,2,[1 3]},...
%     {lon_node,lat_node,lon_cell,lat_cell,sst},'This is an example');
```
*ncwriteschema would be a better choice to write a more complex NetCDF file from structures.
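For comparison, MATLAB's built-in high-level functions can also write a small file directly. The following is a minimal sketch using `nccreate`, `ncwrite`, `ncwriteatt` and `ncread`; it is independent of this toolbox, and the variable names, the `units` value and the temporary output file are assumptions made for the example.

```matlab
fn  = [tempname '.nc'];          % temporary output file
lon = (-75:-55)';                % 21 points
lat = (26:55)';                  % 30 points
sst = rand(numel(lon), numel(lat));

% Define the variables together with their dimensions.
nccreate(fn, 'lon', 'Dimensions', {'lon', numel(lon)});
nccreate(fn, 'lat', 'Dimensions', {'lat', numel(lat)});
nccreate(fn, 'sst', 'Dimensions', {'lon', numel(lon), 'lat', numel(lat)});

% Write the values and one attribute.
ncwrite(fn, 'lon', lon);
ncwrite(fn, 'lat', lat);
ncwrite(fn, 'sst', sst);
ncwriteatt(fn, 'sst', 'units', 'degC');

sst_check = ncread(fn, 'sst');   % read back to confirm the round trip
```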
- `FUN_nc_easywrite_add_var`: add a variable to an existing netcdf file.
- `FUN_nc_easywrite_add_att`: add an attribute to an existing variable in an existing netcdf file.
- `FUN_nc_easywrite`: write one variable into a new netcdf file.
- `FUN_nc_easywrite_write_var`: replace values of an existing variable in an existing netcdf file.
Output of some examples above can be found here