
Question about the dataset format #1

Open
pangxincheng opened this issue Nov 9, 2023 · 4 comments

Comments

@pangxincheng

Hi, thanks for your great work.
I want to deploy your work on my robotic arm, but I am not very clear about the format of the data. Specifically, I have checked the realworld dataset you provided and it has the following format:

{
    "data": {
        "demo_0": {
            "actions": shape (N, 7),
            "obs": {
                "agentview_rgb": shape (N, 128, 128, 3),
                "ee_states": shape (N, 16),
                "eye_in_hand_rgb": shape (N, 128, 128, 3),
                "gripper_states": shape (N, 1),
                "joint_states": shape (N, 7),
            }
        },
        ...
    }
}

I understand the meanings of agentview_rgb, eye_in_hand_rgb, gripper_states, and joint_states, but I am not clear about what each value of actions and ee_states represents.
Could you explain their meanings?
Thanks!
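For reference, the layout above can be listed programmatically with h5py. This is a minimal sketch; the filename and the helper name `dataset_shapes` are placeholders, not part of the dataset release:

```python
import h5py

def dataset_shapes(path):
    """Map each dataset path inside an HDF5 file to its shape."""
    shapes = {}

    def visit(name, obj):
        # visititems walks every group/dataset; keep only datasets
        if isinstance(obj, h5py.Dataset):
            shapes[name] = obj.shape

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return shapes

# e.g. dataset_shapes("demo.hdf5") should report entries such as
# "data/demo_0/actions" -> (N, 7) for a file with the structure above.
```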

@pangxincheng
Author


I have debugged eval.py, and now I understand that ee_states represents the 4×4 end-effector transformation matrix, and that actions means xyz + axis-angle + gripper.

But when I load the real-world data, I find a discrepancy between actions and ee_states. In my understanding, ee_states_{t+1} = ee_states_{t} + actions_{t}, but this does not hold.

My code is as follows:

import h5py
import numpy as np
from scipy.spatial.transform import Rotation as R
import robosuite.utils.transform_utils as trans

def calc_goal_rotmat(delta: np.ndarray, current_orientation: np.ndarray):
    quat_error = trans.axisangle2quat(delta)
    rotation_mat_error = trans.quat2mat(quat_error)
    goal_orientation = np.dot(rotation_mat_error, current_orientation)
    return goal_orientation

def scale_action(action: np.ndarray):
    """
    this function is used when I run the eval.py
    reference from robosuite.controllers.base_controller.Controller.scale_action 
    [robosuite](https://github.com/ARISE-Initiative/robosuite/blob/b9d8d3de5e3dfd1724f4a0e6555246c460407daa/robosuite/controllers/base_controller.py#L104)
    action: np.ndarray shape = (N, 6)
    """
    input_min = np.array([-1., -1., -1., -1., -1., -1.])
    input_max = np.array([1., 1., 1., 1., 1., 1.])
    output_max = np.array([0.05, 0.05, 0.05, 0.5 , 0.5 , 0.5 ])
    output_min = np.array([-0.05, -0.05, -0.05, -0.5 , -0.5 , -0.5 ])
    action_scale = np.abs(output_max - output_min) / np.abs(input_max - input_min)
    action_output_transform = (output_max + output_min) / 2.0
    action_input_transform = (input_max + input_min) / 2.0
    return (action - action_input_transform) * action_scale + action_output_transform

f = h5py.File("RW8_open_the_top_drawer_demo.hdf5", "r")  # load the real-world data
actions = np.array(f["data/demo_0/actions"])
actions[:, :-1] = scale_action(actions[:, :-1])  # scale action

ee_states = np.array(f["data/demo_0/obs/ee_states"]).reshape(-1, 4, 4).transpose(0, 2, 1)  # ee pose in base frame (N, 4, 4)
ee_rotmat = ee_states[:, :3, :3]  # (N, 3, 3)

for i in range(ee_rotmat.shape[0] - 1):
    target_rotmat = calc_goal_rotmat(actions[i, 3:6], ee_rotmat[i])
    print("max rot difference:", np.abs(R.from_matrix(target_rotmat).as_rotvec() - R.from_matrix(ee_rotmat[i + 1]).as_rotvec()).max(), end=", ")
    print("max xyz difference:", np.abs(ee_states[i + 1, :3, 3] - (ee_states[i, :3, 3] + actions[i, :3])).max())
f.close()

The output is as follows:

max rot difference: 0.06717731695839074, max xyz difference: 0.026428571428571412
max rot difference: 0.12910619805085455, max xyz difference: 0.017571192556700033
max rot difference: 0.12899400616200962, max xyz difference: 0.016694306660440306
max rot difference: 0.1273171033890601, max xyz difference: 0.014875318641103308
max rot difference: 0.11722738636511609, max xyz difference: 0.01575476722665664
max rot difference: 0.09960029660031305, max xyz difference: 0.020507130148286645
max rot difference: 0.0881736445381789, max xyz difference: 0.024517996159409694
max rot difference: 0.08918299972444184, max xyz difference: 0.02454872393192914
max rot difference: 0.0917978538396374, max xyz difference: 0.025392810946219313
max rot difference: 0.10303586422841801, max xyz difference: 0.023936941918841825
max rot difference: 0.10536312644094174, max xyz difference: 0.023651353628730443
max rot difference: 0.10443637685785037, max xyz difference: 0.01868286831938254
max rot difference: 0.1047047346610528, max xyz difference: 0.015359910272096378
...

I think there may be a problem with the parameters in scale_action.
Can you help me resolve it? @ShahRutav
Thanks.

@ShahRutav
Collaborator

Hi,

Thanks for showing interest in our work. The equation ee_states_{t+1} = ee_states_{t} + actions_{t} can have a relatively high error, since the robot might not fully achieve action_{t} before the action for the next time step is predicted.

On a side note, we do not use ee_states in our code.
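For example, one way to quantify this gap is to compare the commanded translation delta against what the robot actually moved between consecutive poses. A plain-NumPy sketch (the function name and the (N, 3) array layout are assumptions, not from the MUTEX codebase):

```python
import numpy as np

def tracking_error(ee_pos, commanded_deltas):
    """Per-step gap between the commanded xyz delta and the achieved motion.

    ee_pos: (N, 3) end-effector positions over a trajectory.
    commanded_deltas: (N-1, 3) already-scaled xyz action deltas.
    Returns an (N-1,) array of Euclidean error norms.
    """
    achieved = np.diff(ee_pos, axis=0)  # what the robot actually moved each step
    return np.linalg.norm(achieved - commanded_deltas, axis=1)
```

Nonzero values here are expected: a fixed-frequency controller typically under-shoots the commanded delta before the next action arrives.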

@pangxincheng
Author

Thanks for resolving my doubts 😄 @ShahRutav
I have another question: in the real world, does the action scaling use the same parameters as robosuite (i.e., the parameters below)?

input_min = np.array([-1., -1., -1., -1., -1., -1.])
input_max = np.array([1., 1., 1., 1., 1., 1.])
output_max = np.array([0.05, 0.05, 0.05, 0.5 , 0.5 , 0.5 ])
output_min = np.array([-0.05, -0.05, -0.05, -0.5 , -0.5 , -0.5 ])

This is important for deploying MUTEX on my robotic arm.
Looking forward to your answer.
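For deployment, the inverse of the scale_action mapping (converting a measured real-world delta back into a normalized action) follows directly from the same linear formula. A sketch, assuming that linear mapping; the actual bounds are whatever your controller uses:

```python
import numpy as np

def unscale_action(scaled, output_min, output_max, input_min=-1.0, input_max=1.0):
    """Invert the linear scale_action mapping: real-world delta -> normalized action."""
    scale = (output_max - output_min) / (input_max - input_min)
    output_center = (output_max + output_min) / 2.0
    input_center = (input_max + input_min) / 2.0
    return (scaled - output_center) / scale + input_center
```

Round-tripping a normalized action through scale_action and then unscale_action should recover the original values.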

@ShahRutav
Collaborator

Hi,

Due to hardware constraints, the action scaling can be very different in simulation and in the real world.
We use the deoxys controller for our real-world experiments. The FrankaInterface class might be of interest.

I can attach the controller configs if that is helpful (as soon as I can access the robot). Let me know if you need it!
