A modern C++ header only cdf library with Python bindings
SciQLop/CDFpp
A modern C++ library for NASA's CDF file format. This is not a C++ wrapper but a full C++ implementation.

Why? CDF files are still used for space physics missions, but few implementations are available. The main one is NASA's C implementation, but it lacks multi-threading support (global shared state), has an old C interface, and has a license that is not compatible with the policies of most Linux distributions. There are also Java and Python implementations, which are not usable from C++.
List of features and roadmap:
- CDF reading
  - read files from cdf version 2.2 to 3.x
  - read uncompressed file headers
  - read uncompressed attributes
  - read uncompressed variables
  - read variable attributes
  - loads cdf files from memory (std::vector or char*)
  - handles both row and column major files
  - read variables with nested VXRs
  - read compressed files (GZip, RLE)
  - read compressed variables (GZip, RLE)
  - read UTF-8 encoded files
  - read ISO 8859-1 (Latin-1) encoded files (converts to UTF-8 on the fly)
  - variables values lazy loading
  - decode DEC's floating point encoding (Itanium, ALPHA and VAX)
  - pad values
- CDF writing
  - write uncompressed headers
  - write uncompressed attributes
  - write uncompressed variables
  - write compressed variables
  - write compressed files
  - pad values
- General features
  - uses libdeflate for faster GZip decompression
  - highly optimized CDF reads (up to ~4GB/s read speed from disk)
  - handle leap seconds
  - Python wrappers
  - Documentation
  - Examples (see below)
  - Benchmarks
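To give an idea of what the RLE support in the list above involves, here is a minimal Python sketch. It assumes the zero-run scheme described in the CDF internal format documentation, where a zero byte is followed by a count byte holding the run length minus one and every other byte is stored as is. The function names `rle0_decompress` and `rle0_compress` are hypothetical and are not part of CDFpp or pycdfpp.

```python
def rle0_decompress(data: bytes) -> bytes:
    """Expand zero runs: a 0x00 byte is followed by (run length - 1)."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0:
            if i + 1 >= len(data):
                raise ValueError("truncated zero run")
            out += bytes(data[i + 1] + 1)  # bytes(n) yields n zero bytes
            i += 2
        else:
            out.append(byte)  # non-zero bytes are copied unchanged
            i += 1
    return bytes(out)


def rle0_compress(data: bytes) -> bytes:
    """Inverse of rle0_decompress, splitting zero runs longer than 256."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 256:
                run += 1
            out += bytes((0, run - 1))  # marker byte, then count - 1
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


payload = b"\x01\x00\x00\x00\x02\x00"
packed = rle0_compress(payload)
restored = rle0_decompress(packed)
```

Because only zeros are run-length encoded, this scheme pays off mostly on sparse or zero-padded records, which is presumably why GZip is offered alongside it.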
If you want to understand how it works, how to use the code or what works, you may have to read the tests.
From PyPI:

```bash
python3 -m pip install --user pycdfpp
```

From sources:

```bash
meson build
cd build
ninja
sudo ninja install
```

Or if you want to build a Python wheel:

```bash
python -m build .
# resulting wheel will be located into dist folder
```
Basic example from a local file:
```python
import pycdfpp

cdf = pycdfpp.load("some_cdf.cdf")
cdf_var_data = cdf["var_name"].values  # builds a numpy view or a list of strings
attribute_name_first_value = cdf.attributes['attribute_name'][0]
```
Note that you can also load in-memory files:
```python
import pycdfpp
import requests
import matplotlib.pyplot as plt

tha_l2_fgm = pycdfpp.load(requests.get("https://spdf.gsfc.nasa.gov/pub/data/themis/tha/l2/fgm/2016/tha_l2_fgm_20160101_v01.cdf").content)
plt.plot(tha_l2_fgm["tha_fgl_gsm"])
plt.show()
```
Buffer protocol support:
```python
import pycdfpp
import requests
import xarray as xr
import matplotlib.pyplot as plt

tha_l2_fgm = pycdfpp.load(requests.get("https://spdf.gsfc.nasa.gov/pub/data/themis/tha/l2/fgm/2016/tha_l2_fgm_20160101_v01.cdf").content)
xr.DataArray(
    tha_l2_fgm['tha_fgl_gsm'],
    dims=['time', 'components'],
    coords={'time': tha_l2_fgm['tha_fgl_time'].values, 'components': ['x', 'y', 'z']},
).plot.line(x='time')
plt.show()

# Works with matplotlib directly too
plt.plot(tha_l2_fgm['tha_fgl_time'], tha_l2_fgm['tha_fgl_gsm'])
plt.show()
```
Datetimes handling:
```python
import pycdfpp
import os

# Due to an issue with pybind11 you have to force your timezone to UTC for
# datetime conversion (not necessary for numpy datetime64)
os.environ['TZ'] = 'UTC'

mms2_fgm_srvy = pycdfpp.load("mms2_fgm_srvy_l2_20200201_v5.230.0.cdf")

# to convert any CDF variable holding any time type to python datetime:
epoch_dt = pycdfpp.to_datetime(mms2_fgm_srvy["Epoch"])
# same with numpy datetime64:
epoch_dt64 = pycdfpp.to_datetime64(mms2_fgm_srvy["Epoch"])
# note that using datetime64 is ~100x faster than datetime (~2ns/element on an average laptop)
```
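For readers curious about what such a time conversion amounts to, the sketch below converts a single CDF_EPOCH value by hand using only the standard library. It assumes the usual CDF_EPOCH definition (milliseconds elapsed since 0000-01-01T00:00:00) and does not cover CDF_EPOCH16 or CDF_TT2000, the latter of which needs a leap second table. The name `cdf_epoch_to_datetime` is hypothetical; it illustrates the arithmetic and is not how pycdfpp implements it.

```python
from datetime import datetime, timedelta, timezone

# Milliseconds between 0000-01-01T00:00:00 and 1970-01-01T00:00:00
# (assumed CDF_EPOCH origin, proleptic Gregorian calendar).
UNIX_EPOCH_AS_CDF_EPOCH_MS = 62_167_219_200_000.0

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def cdf_epoch_to_datetime(epoch_ms: float) -> datetime:
    """Convert one CDF_EPOCH value (milliseconds) to a UTC datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=epoch_ms - UNIX_EPOCH_AS_CDF_EPOCH_MS)


# 2000-01-01T00:00:00 expressed as CDF_EPOCH milliseconds
y2k = cdf_epoch_to_datetime(63_113_904_000_000.0)
print(y2k.isoformat())
```

Building one Python `datetime` object per element like this is the kind of per-item overhead that makes the `datetime64` path much faster on large variables.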
Creating a basic CDF file:
```python
import pycdfpp
import numpy as np
from datetime import datetime

cdf = pycdfpp.CDF()
cdf.add_attribute("some attribute", [[1, 2, 3], [datetime(2018, 1, 1), datetime(2018, 1, 2)], "hello\nworld"])
cdf.add_variable("some variable", values=np.ones((10), dtype=np.float64))
pycdfpp.save(cdf, "some_cdf.cdf")
```
C++ example:

```cpp
#include "cdf-io/cdf-io.hpp"
#include <iostream>
#include <string>

// Print a variable shape as "(d0,d1,...)"
std::ostream& operator<<(std::ostream& os, const cdf::Variable::shape_t& shape)
{
    os << "(";
    for (auto i = 0; i < static_cast<int>(std::size(shape)) - 1; i++)
        os << shape[i] << ',';
    if (std::size(shape) >= 1)
        os << shape[std::size(shape) - 1];
    os << ")";
    return os;
}

int main(int argc, char** argv)
{
    auto path = std::string(DATA_PATH) + "/a_cdf.cdf";
    // cdf::io::load returns an optional<CDF>
    if (const auto my_cdf = cdf::io::load(path); my_cdf)
    {
        std::cout << "Attribute list:" << std::endl;
        for (const auto& [name, attribute] : my_cdf->attributes)
        {
            std::cout << "\t" << name << std::endl;
        }
        std::cout << "Variable list:" << std::endl;
        for (const auto& [name, variable] : my_cdf->variables)
        {
            std::cout << "\t" << name << " shape:" << variable.shape() << std::endl;
        }
        return 0;
    }
    return -1;
}
```
- NRV variables shape: in order to expose a consistent shape, PyCDFpp exposes the record count as the first dimension, so its value will be either 0 or 1 (0 means the variable is empty).
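
As an illustration of the NRV caveat above, the following sketch interprets a shape tuple under that convention. The helper `describe_nrv_shape` is hypothetical and works on plain tuples, so it is independent of pycdfpp; it only assumes that the first dimension is the record count and the remaining ones describe a single record.

```python
def describe_nrv_shape(shape):
    """Split an NRV variable shape into (is_empty, per-record shape)."""
    record_count, *value_shape = shape
    if record_count not in (0, 1):
        raise ValueError(f"not an NRV shape, record count is {record_count}")
    # A record count of 0 means the variable holds no value at all.
    return record_count == 0, tuple(value_shape)


# An NRV variable holding one 3-component vector
is_empty, value_shape = describe_nrv_shape((1, 3))
print(is_empty, value_shape)
```

In practice this means code that wants the single record of an NRV variable should first check that the leading dimension is 1 before indexing into it.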