- Notifications
You must be signed in to change notification settings - Fork584
Support LeRobotDataset v3.0#11931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
github-actionsbot commentedNov 20, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Web viewer built successfully.
View image diff onkitdiff. Note: This comment is updated whenever you push a commit. |
ac455f4 to7ea4fc4Comparegithub-actionsbot commentedNov 24, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Latest documentation preview deployed successfully.
Note: This comment is updated whenever you push a commit. |
ntjohnson1 left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Generally looks good to me.
If you're still wrapping things up I think it would be nicer to review/easier to land in chunks.
I think the video utils and codec cleanup is really small and non-controversial. Then can probably land the abstraction with v2 which should have minimal functional impact. Then just land the addition of v3.
Otherwise my only other thought was that there doesn't seem to be much test coverage over these areas. But that kind of seemed to be true before 🤷
| }; | ||
| letmut output =Vec::new(); | ||
| write_avc_chunk_to_nalu_stream(avc1_box,&mut output, chunk, annexb_state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
You don't change this in your PR but it feels weird to me to have mutable output in the middle of the arguments. Does rust have a similar convention to C to usually put these at the end?
| /// | ||
| /// MP4 stores H.264/H.265 samples using AVCC/HVCC length-prefixed NALs and relies on container | ||
| /// metadata for SPS/PPS/VPS. | ||
| pubfnsample_data_in_stream_format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is probably really nice generally! Potentially worth putting a backlog ticket to make this more accessible in the SDK to simplify mp4 to video stream conversion
| /// Columns in the `LeRobot` dataset schema that we do not visualize in the viewer, and thus ignore. | ||
| pubconstLEROBOT_DATASET_IGNORED_COLUMNS:&[&str] = | ||
| &["episode_index","index","frame_index","timestamp"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Just because we don't view them but it might be nice to have them to query against.
| /// # Important | ||
| /// | ||
| /// Currently, this only supports v2 `LeRobot` datasets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
NIT: Seems redundant now that the struct is named v2
179b6ed to5c635f7Compare
Uh oh!
There was an error while loading.Please reload this page.
Related
What
This adds support for the LeRobotDataset v3.0 (huggingface/lerobot#1412) format to our LeRobot dataset loader. Additionally this now loads feature statistics such as min, max, mean values as raw arrow data, making it easier to query data out of a loaded dataset if needed.
Still left to do before this can land: