Add feature to drop last n keyframes for delta_timestamps #129

Closed · wants to merge 3 commits

Conversation

@alexander-soare (Collaborator) commented on May 3, 2024

What does this PR do?

Fixes a discrepancy between data loading in the original diffusion policy work and LeRobot.

The functionality is summarized in diffusion.yaml. I will copy it here verbatim for reference:

  # For each training batch we want (consider n_obs_steps=2, horizon=16):
  # t           | -1,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11, 12, 13, 14
  # action      |  a,  a,  a,  a,  a,  a,  a,  a,  a,  a,  a,   a,  a,  a,  a,  a
  # observation |  o,  o,   ,   ,   ,   ,   ,   ,   ,   ,   ,    ,   ,   ,   ,
  # Note that at rollout we only use some of the actions (consider n_action_steps=8):
  # action used |   ,  a,  a,  a,  a,  a,  a,  a,  a,   ,   ,    ,   ,   ,   ,
  delta_timestamps:
    observation.image: "[i / ${fps} for i in range(1 - ${policy.n_obs_steps}, 1)]"
    observation.state: "[i / ${fps} for i in range(1 - ${policy.n_obs_steps}, 1)]"
    action: "[i / ${fps} for i in range(1 - ${policy.n_obs_steps}, 1 - ${policy.n_obs_steps} + ${policy.horizon})]"

  # The original implementation doesn't sample keyframes for the last 7 steps. This is because, as described
  # above, the last 7 actions from the diffusion model are not used.
  n_end_keyframes_dropped: ${policy.horizon} - ${policy.n_action_steps} - ${policy.n_obs_steps} + 1
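
To make the numbers concrete, here is a small standalone sketch of what these expressions evaluate to. This is illustrative only: fps=10 is an assumed value, while horizon, n_obs_steps, and n_action_steps are the example values from the comment above.

    # Illustrative only: evaluates the delta_timestamps expressions above,
    # assuming fps=10 and the example values n_obs_steps=2, horizon=16, n_action_steps=8.
    fps = 10
    n_obs_steps = 2
    horizon = 16
    n_action_steps = 8

    observation_deltas = [i / fps for i in range(1 - n_obs_steps, 1)]
    # -> [-0.1, 0.0]: the current frame plus one frame of history.

    action_deltas = [i / fps for i in range(1 - n_obs_steps, 1 - n_obs_steps + horizon)]
    # -> [-0.1, 0.0, 0.1, ..., 1.4]: 16 action timestamps starting one frame in the past.

    n_end_keyframes_dropped = horizon - n_action_steps - n_obs_steps + 1
    # -> 7: the "last 7 steps" referred to in the comment above.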

How was it tested?

This change improves eval metrics slightly (blue = new).

[image: evaluation metric plots comparing the two runs (blue = new)]

I also ran a whole epoch of data loading with LeRobot and the original diffusion policy code side by side, checking batch equality with inter-process communication.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR. Try to avoid tagging more than 3 people.



@alexander-soare force-pushed the alexander-soare/add_drop_last_keyframes branch from 5ac046b to b699a2f on May 5, 2024 17:50
@Cadene added the ✨ Enhancement (new feature or request) and 🧠 Policies (something policies-related) labels on May 27, 2024
The review comments below are attached to this hunk from the diff (the proposed index-dropping logic in LeRobotDataset):

    # We drop those indices pertaining to the last n frames of each episode.
    self.index = []
    for from_ix, to_ix in zip(*self.episode_data_index.values(), strict=True):
        self.index.extend(list(range(from_ix, to_ix - n_end_keyframes_dropped)))
@radekosmulski (Contributor) commented:

This will leave self.hf_dataset["index"] and self.hf_dataset["episode_index"] in a state that is inconsistent with self.index and self.episode_data_index.

One problem is that we are duplicating information that already exists on the hf_dataset, namely hf_dataset["index"], so that we can mutate it. So instead of LeRobotDataset being a lightweight wrapper around an hf_dataset, we now introduce new intermediate state (self.index) that maintainers and users will have to know about and keep in mind.

Another problem: what should self.episode_data_index now reflect? It holds info for the data in hf_dataset but not for self.index, which can be confusing.

And the dependencies on this newly introduced state cascade down: num_samples no longer depends directly on self.hf_dataset, and so on.

So the intentions here are very good -- we want to make life better for users (so that they can train better models), and in some sense this implementation adds the new functionality in as straightforward a way as possible. But the side effect is that the complexity of LeRobotDataset grows quite significantly and it is no longer clear what depends on what. This can lead to subtle bugs and will make extending the functionality much harder down the road. It will also make it harder for newcomers to the codebase to understand how things work -- for instance, what is the difference between self.hf_dataset["index"] and self.index, and under what conditions do they differ? I know the answers to these questions, but only because I have spent a good bit of time with this code; for new folks it could be quite challenging.

One potential way of tackling this might be to attack it via sampling -- in the same way as we have drop_last on the pytorch DataLoader. We could have a custom sampler we could pass to the dataloader that would drop the last n examples from each episode based on the episode_data_index.

The advantage here is that we would not be adding complexity to LeRobotDataset. It would remain truer to its original intention: not a way to modify data, but a wrapper that adds to a HuggingFace dataset in order to cater to the needs of training policies for controlling robots. The dependency would be one way -- LeRobotDataset would depend on the hf_dataset it gets passed, but not the other way around. For all intents and purposes, the data contained in the hf_dataset would remain immutable: we would neither modify it nor create state standing in for aspects of it. Our users would always know where the single source of truth is.

The full functionality would also be localized to a Sampler. If users want to avail themselves of this functionality they would need to know that the Sampler exists (maybe a good place to introduce it would be as an example in the docstring for DiffusionPolicy), but we would not be adding extra complexity to LeRobotDataset -- arguably we would be fighting complexity in the codebase by reducing coupling and doing something like dependency injection at the level of the dataloader. A rough sketch of what such a sampler could look like follows below.
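
Purely as a sketch (the class name here is made up, and I'm assuming episode_data_index holds per-episode from/to index boundaries as in the hunk above), such a sampler could look something like this:

    import torch
    from torch.utils.data import Sampler


    class EpisodeAwareSampler(Sampler):
        """Sketch: sample dataset indices while skipping the last `n_end_keyframes_dropped`
        frames of every episode, leaving the dataset itself untouched."""

        def __init__(self, episode_data_index: dict, n_end_keyframes_dropped: int = 0, shuffle: bool = True):
            # episode_data_index is assumed to map "from"/"to" to per-episode boundary tensors.
            self.indices = []
            for from_ix, to_ix in zip(*episode_data_index.values(), strict=True):
                self.indices.extend(range(int(from_ix), int(to_ix) - n_end_keyframes_dropped))
            self.shuffle = shuffle

        def __iter__(self):
            if self.shuffle:
                yield from (self.indices[i] for i in torch.randperm(len(self.indices)).tolist())
            else:
                yield from iter(self.indices)

        def __len__(self):
            return len(self.indices)

    # Usage sketch: the dataset stays as-is, only the dataloader changes.
    # sampler = EpisodeAwareSampler(dataset.episode_data_index, n_end_keyframes_dropped=7)
    # dataloader = torch.utils.data.DataLoader(dataset, batch_size=64, sampler=sampler)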

Sorry, these are just some loose thoughts, though I understand that this problem is not easy to solve. But maybe some of this might be helpful.

I'll keep thinking about this, and if anything comes to mind I will comment again in this PR 🙂

@radekosmulski (Contributor) commented:

Just for reference, here is a SubsetRandomSampler from pytorch.

It might be nice to inherit from it and instantiate it with episode_data_index and n_end_keyframes_dropped. A minimal implementation might be a utils function that returns an index from a LeRobotDataset with the last n_end_keyframes_dropped frames of each episode excluded, as sketched below.
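
Something along these lines, for example (again just a rough sketch; the function name is made up):

    from torch.utils.data import DataLoader, SubsetRandomSampler


    def make_index_without_end_keyframes(episode_data_index: dict, n_end_keyframes_dropped: int) -> list[int]:
        """Sketch of a utils function: return all dataset indices except the last
        `n_end_keyframes_dropped` frames of each episode."""
        indices = []
        for from_ix, to_ix in zip(*episode_data_index.values(), strict=True):
            indices.extend(range(int(from_ix), int(to_ix) - n_end_keyframes_dropped))
        return indices

    # indices = make_index_without_end_keyframes(dataset.episode_data_index, n_end_keyframes_dropped=7)
    # dataloader = DataLoader(dataset, batch_size=64, sampler=SubsetRandomSampler(indices))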

Anyhow, sorry 🙂 Not sure if any of this is helpful, I might be completely off the mark with the intention behind this PR.

Either way, it is very good to know of this difference vs the original training on pusht, so thank you for pointing me to this PR, @alexander-soare! 🙌

@alexander-soare (Collaborator, Author) commented:

@radekosmulski I find your arguments very convincing :) Would you be interested in having a crack at the sampler approach by any chance? cc @Cadene who had this on his TODO list at some point but it got shoved back by other prios.

@radekosmulski (Contributor) commented:

sure thing @alexander-soare, will give it a go! 🙂

@alexander-soare marked this pull request as draft on May 30, 2024 08:09