Support LeRobotDataset v3.0 #11931


Draft
oxkitsune wants to merge 14 commits into main from gijs/lerobot-datasetv3

Conversation

@oxkitsune (Member) commented Nov 20, 2025 (edited)

Related

What

This adds support for the LeRobotDataset v3.0 format (huggingface/lerobot#1412) to our LeRobot dataset loader. Additionally, the loader now reads feature statistics such as min, max, and mean values as raw Arrow data, making it easier to query data out of a loaded dataset if needed.

Still left to do before this can land:

  • Make sure the checklist item still passes, and add a v3 dataset item
  • Another pass over the new code structure; I moved lots of things around and have a feeling some things aren't optimal
  • Compatibility testing; so far this has been tested on lerobot/aloha_mobile_cabinet and lerobot/aloha_mobile_elevator. I want to try some third-party datasets as well.

@oxkitsune added the 📺 re_viewer (affects re_viewer itself), include in changelog, and feat-dataloader (Everything related to data loaders) labels on Nov 20, 2025
@github-actions (bot) commented Nov 20, 2025 (edited)

Web viewer built successfully.

Commit   Link                               Manifest
5c635f7  https://rerun.io/viewer/pr/11931   +nightly +main

View image diff on kitdiff.

Note: This comment is updated whenever you push a commit.

@oxkitsune force-pushed the gijs/lerobot-datasetv3 branch 3 times, most recently from ac455f4 to 7ea4fc4 on November 24, 2025 10:42
@github-actions (bot) commented Nov 24, 2025 (edited)

Latest documentation preview deployed successfully.

Commit   Link
5c635f7  https://landing-c9a2n5f29-rerun.vercel.app/docs

Note: This comment is updated whenever you push a commit.

@ntjohnson1 (Member) left a comment

Generally looks good to me.
If you're still wrapping things up, I think it would be nicer to review and easier to land in chunks.

I think the video utils and codec cleanup is really small and non-controversial. Then you can probably land the abstraction with v2, which should have minimal functional impact. Then just land the addition of v3.

Otherwise, my only other thought was that there doesn't seem to be much test coverage over these areas. But that kind of seemed to be true before 🤷

};

let mut output = Vec::new();
write_avc_chunk_to_nalu_stream(avc1_box, &mut output, chunk, annexb_state)

You don't change this in your PR, but it feels weird to me to have the mutable output in the middle of the arguments. Does Rust have a convention similar to C's of usually putting these at the end?
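
For illustration, here is a minimal sketch of two alternative shapes for such a function: one takes the output buffer as the last argument, the other returns a fresh `Vec`. Rust itself does not enforce a position for `&mut` output parameters (for example, the `write!` macro takes its destination first). The names `write_chunk_into` and `chunk_to_vec` and their parameters are hypothetical and are not taken from this PR.

```rust
/// Hypothetical helper: appends `prefix` followed by `chunk` to `output`.
/// The mutable output buffer is placed last, mirroring the common C convention.
pub fn write_chunk_into(chunk: &[u8], prefix: &[u8], output: &mut Vec<u8>) {
    output.extend_from_slice(prefix);
    output.extend_from_slice(chunk);
}

/// Hypothetical alternative: returns a new buffer instead of taking an
/// out-parameter, which avoids the ordering question entirely.
pub fn chunk_to_vec(chunk: &[u8], prefix: &[u8]) -> Vec<u8> {
    let mut output = Vec::with_capacity(prefix.len() + chunk.len());
    write_chunk_into(chunk, prefix, &mut output);
    output
}
```

The out-parameter form lets a caller reuse one allocation across many chunks, while the returning form is simpler to call when allocation cost does not matter.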

///
/// MP4 stores H.264/H.265 samples using AVCC/HVCC length-prefixed NALs and relies on container
/// metadata for SPS/PPS/VPS.
pub fn sample_data_in_stream_format(

This is probably really nice generally! Potentially worth filing a backlog ticket to make this more accessible in the SDK, to simplify MP4-to-video-stream conversion.
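
For context, the core of the conversion described in the doc comment above (length-prefixed NAL units rewritten as an Annex B stream) can be sketched as follows. This is a simplified illustration, not the PR's implementation: the function name `length_prefixed_to_annexb` and the `length_size` parameter are assumptions, and the sketch does not emit the SPS/PPS/VPS parameter sets that a full conversion has to take from the container metadata.

```rust
/// Annex B start code written in place of each length prefix.
const ANNEXB_START_CODE: [u8; 4] = [0, 0, 0, 1];

/// Hypothetical helper: rewrites one length-prefixed (AVCC/HVCC style) sample
/// as an Annex B byte stream.
///
/// `length_size` is the size in bytes of each big-endian length prefix
/// (assumed here to be known from the container's codec configuration).
/// Returns `None` if `length_size` is outside 1..=4 or the sample is truncated.
pub fn length_prefixed_to_annexb(sample: &[u8], length_size: usize) -> Option<Vec<u8>> {
    if !(1..=4).contains(&length_size) {
        return None;
    }

    let mut out = Vec::with_capacity(sample.len() + ANNEXB_START_CODE.len());
    let mut rest = sample;

    while !rest.is_empty() {
        if rest.len() < length_size {
            return None; // truncated length prefix
        }
        let (len_bytes, tail) = rest.split_at(length_size);

        // Decode the big-endian NAL unit length.
        let nal_len = len_bytes
            .iter()
            .fold(0usize, |acc, &b| (acc << 8) | usize::from(b));

        if tail.len() < nal_len {
            return None; // truncated NAL unit payload
        }
        let (nal, tail) = tail.split_at(nal_len);

        // Replace the length prefix with a start code and copy the payload.
        out.extend_from_slice(&ANNEXB_START_CODE);
        out.extend_from_slice(nal);
        rest = tail;
    }

    Some(out)
}
```

Returning `Option` keeps the sketch short; a real API would more likely return a dedicated error type that distinguishes the failure cases.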


/// Columns in the `LeRobot` dataset schema that we do not visualize in the viewer, and thus ignore.
pub const LEROBOT_DATASET_IGNORED_COLUMNS: &[&str] =
    &["episode_index", "index", "frame_index", "timestamp"];

Just because we don't view them doesn't mean we should drop them; it might be nice to have them to query against.
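
One way to act on this suggestion, sketched under assumptions: instead of discarding the ignored columns, split the column names into those that are visualized and those that are kept only for querying. The helper `split_columns` is hypothetical and not part of the PR; only the constant is taken from the diff above.

```rust
/// Columns in the `LeRobot` dataset schema that are not visualized in the viewer.
pub const LEROBOT_DATASET_IGNORED_COLUMNS: &[&str] =
    &["episode_index", "index", "frame_index", "timestamp"];

/// Hypothetical helper: splits column names into
/// `(visualized, query_only)` while preserving their original order.
pub fn split_columns<'a>(columns: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    columns
        .iter()
        .copied()
        .partition(|name| !LEROBOT_DATASET_IGNORED_COLUMNS.contains(name))
}
```

With this shape, the loader could still log the query-only columns without creating views for them.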

Comment on lines 83 to 85
/// # Important
///
/// Currently, this only supports v2 `LeRobot` datasets.

NIT: Seems redundant now that the struct is named v2

@oxkitsune added the do-not-merge (Do not merge this PR) label on Nov 25, 2025

Reviewers

@ntjohnson1 left review comments

Assignees

No one assigned

Labels

do-not-merge (Do not merge this PR), feat-dataloader (Everything related to data loaders), include in changelog, 📺 re_viewer (affects re_viewer itself)

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Can't load lerobot v3.0 datasets

3 participants

@oxkitsune @ntjohnson1
